ABSTRACT

A system of parallel processes is said to be asynchronous
if each process has its own independent clock. We show here
how sensitive the differences in efficiency between synchronous
and asynchronous systems, indicated by other researchers in the
past (see [Arjomandi, Fischer, Lynch, 81]), are to restrictions on
the relative speeds of processes. For any s, n, a particular
distributed problem (the [s,n]-session problem, defined in
[Arjomandi, Fischer, Lynch, 81]) requires time at least
(s-1)(log n / v) in any asynchronous system with relative speed
ratio v, assuming that concurrency is modelled as nondeterminism
in a single sequence of steps of processes. The same problem
requires at least $\Omega(s \log\log n)$ time in any asynchronous
system with quadratic relative acceleration and the same model of
concurrency. In general, if we can partition the "distributed
time" t required to solve the [s,n]-session problem into m(t)
non-overlapping segments of sizes a(1),...,a(m), such that in each
segment a message written on any variable x cannot be propagated
to more than n-1 variables in the duration of the segment, then
a lower bound for time is the smallest t satisfying $m(t) \ge s$.

In contrast, the [s,n]-session problem can be solved in
time s in a synchronous system and in time $s \cdot v$ in an asynchronous
system of relative speed ratio v whose model of concurrency is
only a partial order of steps of processors and in which "real"
parallelism is allowed.


1.   Introduction 

[Arjomandi, Fischer, Lynch, 81] showed a lower bound for
the distributed time needed to solve the [s,n]-session problem
on any asynchronous system. This bound is $\Omega(s \log n)$. The systems
considered were assumed to have no restrictions on the speed behavior
of processes, but the number of processes that could access any
particular communication channel (shared variable) was bounded.

We indicate here that additional knowledge about relative
processor speeds lowers the bounds presented in [Arjomandi,
Fischer, Lynch, 81]. We also indicate that seemingly identical
restrictions  on  speeds  may  lead  to  different  performance  when 
the  model  of  concurrency  is  changed. 

The above seems to imply that the notion of "minimal round"
(see [Arjomandi, Fischer, Lynch, 81]) as a measure of time in a
distributed system is not suitable for cases in which additional
knowledge about the number of steps of processes in each round exists.

2.   LOWER  BOUNDS 

2.1   The  Model  Where  Concurrency  is  Nondeterminism. 

We follow here the model of [Arjomandi, Fischer, Lynch, 81]
with some additional restrictions on the admissible computations
(see also [Lynch, Fischer, 81]). A concurrent system is a collection
P of processes and X of shared variables. The global state
consists of the internal state of each process together with the
value of each shared variable. A step $\sigma$ is an atomic action
consisting of simultaneous changes to the state of some process
and the value of some shared variable, i.e. a pair

$\sigma = ((s,p,t),(u,x,v))$

where s and t are the states of process p before and after the
execution of the step $\sigma$ and u, v are the values of variable x
before and after the execution of step $\sigma$. Step $\sigma$ is applicable
to any global state in which process p is in state s and variable
x has the value u. We define process($\sigma$) to be p and variable($\sigma$) to
be x. A system is specified by P, X, an initial global state and
a set OKSTEPS of possible steps. A process p blocks in a global
state g if there is no step $\sigma$ in OKSTEPS applicable to g with
process($\sigma$) = p. Let $x \in X$ and define

$\mathrm{locality}(x) = \{\mathrm{process}(\sigma) : \sigma \in \mathrm{OKSTEPS} \text{ and } \mathrm{variable}(\sigma) = x\}$

A computation is a (finite or infinite) sequence of steps in
OKSTEPS.
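
The definitions above can be made concrete with a minimal sketch. The class name `Step`, the helper names and the two-process example system are illustrative assumptions, not part of the original model description.

```python
from typing import Hashable, NamedTuple

class Step(NamedTuple):
    """One atomic step ((s, p, t), (u, x, v)) of the model."""
    s: Hashable  # state of process p before the step
    p: str       # process taking the step
    t: Hashable  # state of process p after the step
    u: Hashable  # value of variable x before the step
    x: str       # shared variable accessed by the step
    v: Hashable  # value of variable x after the step

def applicable(step, proc_state, var_value):
    """A step applies when p is in state s and x holds the value u."""
    return proc_state[step.p] == step.s and var_value[step.x] == step.u

def blocks(p, oksteps, proc_state, var_value):
    """p blocks if no step of OKSTEPS with process p is applicable."""
    return not any(st.p == p and applicable(st, proc_state, var_value)
                   for st in oksteps)

def locality(x, oksteps):
    """Set of processes that have some step in OKSTEPS accessing x."""
    return {st.p for st in oksteps if st.x == x}

# Hypothetical system: two processes sharing one variable "x".
OKSTEPS = [
    Step("idle", "p1", "done", 0, "x", 1),
    Step("idle", "p2", "done", 1, "x", 2),
]
proc_state = {"p1": "idle", "p2": "idle"}
var_value = {"x": 0}

print(locality("x", OKSTEPS))                        # both processes
print(blocks("p2", OKSTEPS, proc_state, var_value))  # True: p2 needs x == 1
```

In this toy system the variable "x" has locality of size 2, so the system is b-bounded for any b >= 2.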

An infinite computation is admissible if every process appears
infinitely often in the sequence. A round is any sequence of steps
such that every process appears at least once in the sequence.

A minimal round is a round such that no proper prefix is a
round (or, in other words, it is any sequence of steps such that
every process appears at least once and the process of the last
step appears exactly once). The run time for a finite sequence of
steps is defined to be the number of segments in the partition into
minimal rounds. (The last segment may be just a part of a round.)
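
As an illustration of this time measure, the sketch below cuts a finite sequence greedily into minimal rounds and counts the segments. It assumes a step is represented only by the name of its process; the function name `run_time` is hypothetical.

```python
def run_time(procs_of_steps, all_procs):
    """Number of segments when a finite step sequence is cut greedily
    into minimal rounds; a trailing partial round counts as one segment."""
    all_procs = set(all_procs)
    segments = 0
    seen = set()
    for p in procs_of_steps:
        seen.add(p)
        if seen == all_procs:   # shortest prefix containing every process
            segments += 1
            seen = set()
    if seen:                    # leftover steps form a partial round
        segments += 1
    return segments

# Rounds: [p1, p1, p2], [p2, p1], and the partial round [p2].
print(run_time(["p1", "p1", "p2", "p2", "p1", "p2"], {"p1", "p2"}))  # 3
```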

Let the relative speed v of a system be an integer v > 0 such
that in each round there is at least one process which does (at
least) v steps.

In cases in which v changes with the number of rounds such that

$v(\mathrm{round}\ r+1) = [v(\mathrm{round}\ r)]^{a}, \quad a > 1$

where a is an integer constant, a is called the relative
acceleration of the system.

We shall look at systems which are

(1) b-bounded (i.e. $\forall x$, $|\mathrm{locality}(x)| \le b$) and

(2) of relative speed v > 1 or

(3) of relative acceleration a > 1.

An asynchronous system is a system (algorithm) whose allowable
computations are all its infinite admissible computations which
conform to restrictions such as (2) or (3) above.

2.2   THE  PROBLEM  (see  also  [Arjomandi,  Fischer,  Lynch,  81])

We shall examine the [s,n]-session problem again, since this is,
up to now, the only problem for which a provable difference
between synchronous and asynchronous systems has been noticed.

Let $Y \subseteq X$ be a distinguished set of variables (ports).
A port-event is any step accessing a port. A session is any
sequence of steps containing at least one port-event for every port.
A computation performs s sessions if it can be partitioned into
s segments, each being a session. An infinite computation is
ultimately quiescent if it contains only a finite number of
port-events. The time to quiescence is the run time of the shortest
prefix of the computation containing all port-events.

Let $s, n \in N$. The [s,n]-session problem is the problem of
finding a concurrent system with n ports such that every allowable
computation performs (at least) s sessions and is ultimately
quiescent.
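
A small sketch may help fix the notion of a session. It assumes a computation is given as the list of variables accessed by its steps, and it counts sessions by cutting greedily as soon as every port has been seen; the name `count_sessions` is illustrative.

```python
def count_sessions(accessed_vars, ports):
    """Largest s such that the sequence can be partitioned into s sessions.
    Cutting as early as possible maximises the number of segments; any
    leftover steps can be appended to the last session."""
    ports = set(ports)
    sessions, seen = 0, set()
    for x in accessed_vars:
        if x in ports:              # only port-events matter
            seen.add(x)
            if seen == ports:       # every port accessed: one session done
                sessions += 1
                seen = set()
    return sessions

# Ports y1, y2; "z" is an ordinary shared variable.
print(count_sessions(["y1", "z", "y2", "y2", "y1", "y1"], {"y1", "y2"}))  # 2
```

A computation then performs s sessions exactly when this count is at least s.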


2.3   Some  important  lemmas

Let $b \ge 2$ and $v(i) \ge 2$ for all $i \in N$.

Let R = {1,...,A} be a set of round numbers. Let $P_r$ be the set
of processes which do steps in round r. Let D be a sequence of
steps corresponding to R (the same step may appear in two rounds
or even more than once in the same round). Assume that for every
round $r \in R$ there is exactly one process $p^* \in P_r$ such that
for every $p \ne p^*$ there are exactly v(r) > 1 steps $\sigma \in D$ such that

the round of $\sigma$ is r (round($\sigma$) = r)

and

the process of $\sigma$ is p (process($\sigma$) = p)

and

there is exactly one $\sigma \in D$

such that

round($\sigma$) = r and process($\sigma$) = $p^*$.

The above restricted systems are called "fast majority-slow
minority systems" (FMSM-systems).

Let us define a partial order $\le$ such that

(a) If $\sigma \le \tau$ then round($\sigma$) $\le$ round($\tau$).

(b) If $\sigma \le \tau$ then either

variable($\sigma$) = variable($\tau$) or
process($\sigma$) = process($\tau$).

(c) If $\sigma \le \tau$ then $\sigma$ precedes $\tau$ in D.

(d) If either variable($\sigma$) = variable($\tau$) or process($\sigma$) = process($\tau$)
then $\sigma$, $\tau$ are $\le$-comparable.


Definition. For any $\sigma \in D$,

$\mathrm{dep}(\sigma) \triangleq \{\mathrm{variable}(\tau) : \tau \in D \text{ and } \sigma \le \tau\}$
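
The following sketch computes dep for a finite sequence. It assumes the order is generated as in the proof of Theorem 1 (an earlier step precedes a later one when they share a process or a variable, closed under transitivity) and represents each step as a hypothetical (process, variable) pair.

```python
def dep(steps, i):
    """Variables of all steps tau with steps[i] <= tau.
    A later step depends on steps[i] when its process or its variable
    was already touched by a dependent step, so one forward scan suffices."""
    procs = {steps[i][0]}        # processes that took a dependent step
    variables = {steps[i][1]}    # variables accessed by dependent steps
    for p, x in steps[i + 1:]:
        if p in procs or x in variables:
            procs.add(p)
            variables.add(x)
    return variables

D = [("p1", "x"), ("p2", "x"), ("p2", "y"), ("p3", "z")]
print(sorted(dep(D, 0)))  # ['x', 'y']
print(sorted(dep(D, 3)))  # ['z']
```

On this example dep of the second step is contained in dep of the first, as Lemma 1 requires for comparable steps.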

Lemma 1

If $\sigma_1 \le \sigma_2$ then $\mathrm{dep}(\sigma_2) \subseteq \mathrm{dep}(\sigma_1)$.

Proof. Easy. $\Box$

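Lemma 2

Let $\sigma \in D$, round($\sigma$) = r, variable($\sigma$) = x.

Let

$C_1^{(\sigma)} = \{\tau_1 \in D : \mathrm{round}(\tau_1) = r \text{ and } \sigma \le \tau_1 \text{ and } \mathrm{process}(\tau_1) = \mathrm{process}(\sigma)\}$

and

$C_2^{(\sigma)} = \{\tau_2 \in D : \mathrm{round}(\tau_2) = r+1 \text{ and } \mathrm{process}(\tau_2) \in \mathrm{locality}(x)\}$

Then

$\mathrm{dep}(\sigma) \subseteq \{x\} \cup \bigcup_{\tau_1 \in C_1^{(\sigma)}} \bigl(\{\mathrm{variable}(\tau_1)\} \cup \mathrm{dep}(\tau_1)\bigr) \cup \bigcup_{\tau_2 \in C_2^{(\sigma)}} \mathrm{dep}(\tau_2)$

Proof. By induction on the depth of $\le$, beginning with maximal elements. $\Box$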


Lemma 3. For each $\sigma \in D$ it is the case that

$|\mathrm{dep}(\sigma)| \le \sum_{r=1}^{A} i_r$, where $r \in R$

and where $i_r$ = number of steps s in round r such that $\sigma \le s$.

Proof. Obvious.


Let  us  now  consider  D  to  be  the  following  admissible  sequence: 
Processes  do  steps  in  round  robin  fashion,  except  for  one  process. 
The  rest  do  v(r)  round-robins  (in  round  r)  and  then  the  one 
(excluded)  process  does  one  step,  at  the  end  of  the  (big)  round. 
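
The schedule D just described can be sketched as follows. For simplicity the sketch assumes the same excluded (slow) process in every round, which the text does not require; the function and parameter names are illustrative.

```python
def fmsm_schedule(processes, slow, v):
    """Build D as a list of (round, process) pairs: in round r the fast
    processes do v[r-1] round-robins, then the slow process takes one step."""
    fast = [p for p in processes if p != slow]
    schedule = []
    for r, speed in enumerate(v, start=1):
        for _ in range(speed):
            schedule.extend((r, p) for p in fast)
        schedule.append((r, slow))   # single step of the excluded process
    return schedule

D = fmsm_schedule(["p1", "p2", "p3"], slow="p3", v=[2, 2])
print([p for r, p in D if r == 1])  # ['p1', 'p2', 'p1', 'p2', 'p3']
```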

Let us consider first the case in which D consists of only
one round. Let us try to calculate then dep($\sigma$) for $\sigma \in D$.
Assume also that we consider only "fast majority-slow minority"
systems with speed v(r) in round r.

If we for the moment forget about the last step of the single
round of D, then D can be decomposed into v(1) rounds for the rest
of the processes. In these rounds the speed is 1, and according
to a lemma of [Arjomandi, Fischer, Lynch, 81], dep($\sigma$) will satisfy

$|\mathrm{dep}(\sigma)| \le \frac{b^{v(1)}-1}{b-1}$

(The last step in the single round of D cannot change the depth,
because the process of that step is a "new" process and, hence,
the only way for its step to depend on $\sigma$ would be to access the
same variable. This clearly does not add to dep($\sigma$).)

The above argument proves that

$i_1 \le \frac{b^{v(1)}-1}{b-1}$

for the special D considered.

Given $i_r$, in the best case, $b \cdot i_r$ processes will have read
these variables and will do steps dependent on $\sigma$ in round r+1.

For each of the $i_r$ variables (accessed in round r and belonging
to steps dependent on $\sigma$), $\frac{b^{v(r+1)}-1}{b-1}$ new variables can be
accessed in round r+1 for the special form of D considered, by the
above argument.

This implies

$i_{r+1} \le i_r \cdot \frac{b^{v(r+1)}-1}{b-1}$

which furthermore implies

$i_r \le \frac{1}{(b-1)^{r}} \prod_{j=1}^{r} \bigl(b^{v(j)}-1\bigr)$

Hence

$|\mathrm{dep}(\sigma)| \le \sum_{r=1}^{A} \frac{1}{(b-1)^{r}} \prod_{j=1}^{r} \bigl(b^{v(j)}-1\bigr)$
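
To get a feeling for how fast this bound grows, the sketch below evaluates it exactly and finds the largest number of rounds A for which it stays at most n-1. The function names and the sample parameters (b = 2, n = 1000) are assumptions chosen for illustration.

```python
def dep_bound(b, v):
    """Sum over r of prod_{j<=r} (b**v[j] - 1) / (b - 1), for speeds v[0..A-1].
    Each factor equals 1 + b + ... + b**(v-1), so integer division is exact."""
    total, i_r = 0, 1
    for speed in v:
        i_r *= (b ** speed - 1) // (b - 1)   # bound on i_r from i_{r-1}
        total += i_r
    return total

def max_rounds(b, n, speed_of_round):
    """Largest A such that the bound on |dep| is still at most n - 1."""
    A, v = 0, []
    while dep_bound(b, v + [speed_of_round(A + 1)]) <= n - 1:
        A += 1
        v.append(speed_of_round(A))
    return A

print(dep_bound(2, [2, 2]))                          # 3 + 9 = 12
print(max_rounds(2, 1000, lambda r: 2))              # constant speed 2: 5
print(max_rounds(2, 1000, lambda r: 2 ** 2 ** (r - 1)))  # speeds 2, 4, 16, ...: 2
```

With constant speed the admissible number of rounds shrinks roughly like (log n)/v, while squaring the speed in each round shrinks it much faster.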


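Corollaries

(i) For v(r) = v (acceleration = 1), each factor satisfies
$\frac{b^{v}-1}{b-1} \le b^{v}$, so we have

$|\mathrm{dep}(\sigma)| \le \sum_{r=1}^{A} \left(\frac{b^{v}-1}{b-1}\right)^{r} = O(b^{vA})$

(ii) For $v(r) = v(r-1)^{2}$ (acceleration = 2) we get

$|\mathrm{dep}(\sigma)| \le O\bigl(b^{v(1)^{2^{A}}}\bigr)$

(iii) Also, for v(r) = v for all r,

$|\mathrm{dep}(\sigma)| \le n-1$ implies $A = O(\log_b n / v)$

(iv) For $v(r) = v(r-1)^{2}$,

$|\mathrm{dep}(\sigma)| \le n-1$ implies $A = O(\log_b \log_{v(1)} n)$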


2.4   The  lower  bound  theorem 

Theorem  1.   (Main  Result) . 

Assume $b, s, n \in N$, $b \ge 2$. For every b-bounded asynchronous system
with speed restrictions (e.g. acceleration a > 1) which solves the
[s,n]-session problem, there is an allowable computation for which
the time to quiescence is lower bounded by a function of n,
depending on the restrictions. The lower bound on the number t
of minimal rounds needed for quiescence can be obtained as follows:

Assume that we can partition t into m(n,t) nonoverlapping segments
of sizes a(1),...,a(m), such that, in each segment, a message written
on any variable x cannot be propagated to more than n-1 variables
in the duration of the segment. (This can always be done for the
systems with speed restrictions considered in this paper.) Then a
lower bound for t is the smallest value satisfying $m(n,t) \ge s$.



Proof. Assume an asynchronous system which solves the [s,n]-session
problem. Let t be the number of rounds to quiescence for a particular
computation $\alpha$ which is as close to round-robin as possible, preserving
the speed restrictions of the system. (For example, if we have
"fast majority-slow minority" speed-bounded systems with relative
speed v, then the computation will have the form of D discussed
previously.) Let $\alpha = \beta\gamma$ where $\beta$ contains the first t rounds of $\alpha$.
Construct a new infinite admissible computation $\alpha' = \beta'\gamma$ where
$\beta'$ is a reordering of the steps of $\beta$ that results in the same global
state as $\beta$ but performs as few sessions as possible.

To construct $\beta'$ we first define a partial order $\le_\beta$ of the steps
of $\beta$, representing "dependency". (Formally, the domain of the
partial order consists of ordered pairs $(i,\sigma_i)$ where $\sigma_i$ is the
i-th step of $\beta$.) For every pair of steps $\sigma, \tau$ in $\beta$ we let $\sigma \le' \tau$
if $\sigma$ precedes $\tau$ in $\beta$ and either process($\sigma$) = process($\tau$) or
variable($\sigma$) = variable($\tau$). Close $\le'$ under transitivity to get
$\le_\beta$. $\le_\beta$ is a partial order and every total order of the steps of
$\beta$ consistent with $\le_\beta$ is a computation which leaves the system
in the same global state as $\beta$. ($\beta$ itself is such a total order.)
Let $\beta = \beta_1 \cdots \beta_m$ where $\beta_k$ consists of a(k) minimal rounds, $1 \le k \le m$,
such that the following decomposition is possible:

Let $y_0$ be an arbitrary port. For k = 1,...,m we define
inductively a port $y_k$ and two sequences of steps $\varphi_k$ and $\psi_k$ as follows
(there are 2 cases).

Case 1. If there is some port which is not accessed by any step
of $\beta_k$, then take $y_k$ to be that port and let $\varphi_k$ = null, $\psi_k = \beta_k$.

Case 2. Else, let $\tau_k$ be the first step in $\beta_k$ which accesses $y_{k-1}$.
Suppose that we choose a(k) in such a way that $|\mathrm{dep}(\tau_k)| \le n-1$.
(This can be done in different ways, depending on the
restrictions on speeds and on access to variables. One such way was
shown in [Arjomandi, Fischer, Lynch, 81]. For speed-bound systems
(v(r) = v, a = 1) a(k) is $O(\log_b n / v)$ and for acceleration-bound
systems (a > 1) a(k) is $O(\log_b \log_{v(1)} n)$, in view of the corollaries of
Section 2.3.)

Since there are n ports in total, the above means that there
is one port $y_k$ and a step $\sigma_k$ such that

(i) $\sigma_k$ is the last step in $\beta_k$ accessing $y_k$

(ii) It is false that $\tau_k \le_\beta \sigma_k$.

Thus, adding the relation $\sigma_k \le \tau_k$ to $\le_\beta$ and transitively
closing, we get another partial order $\le_k$. Choose any total
order of the steps in $\beta_k$ consistent with $\le_k$. Let $\varphi_k$ be the
longest prefix of the ordering not containing any step accessing
$y_{k-1}$ and $\psi_k$ be the remainder.

In either case 1 or 2, $\varphi_k$ does not contain any step accessing
$y_{k-1}$ and $\psi_k$ does not contain any step accessing $y_k$. Let

$\beta' = \varphi_1 \psi_1 \varphi_2 \psi_2 \cdots \varphi_m \psi_m$

$\beta'$ is consistent with $\le_\beta$ but it contains at most m sessions, since
each session must contain steps on both sides of some $\varphi_k$-$\psi_k$
boundary.
We want $m \ge s$ in order for the computation to be correct. But
m = m(n,t). So, the lower bound for t is the minimum t satisfying
$m(n,t) \ge s$. (m increases with t.)

Examples 

In general, if a(k) = a(n) for every k, then $m(n,t) = \frac{t}{a(n)}$ and the
lower bound is $t \ge s \cdot a(n)$. For speed-bound (v, a = 1) systems,
$a(n) = O(\log_b n / v)$, implying $t \ge \Omega((s \cdot \log_b n)/v)$. For
acceleration-bound (a > 1) systems, $a(n) = O(\log_b \log_{v(1)} n)$, implying
$t \ge \Omega(s \log_b \log_{v(1)} n)$.

The case considered in [Arjomandi, Fischer, Lynch, 81] was
that of v = 1. The lower bound follows immediately.
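
As a worked instance of the rule "smallest t with m(n,t) >= s", the sketch below sums the first s segment sizes. It assumes, purely for illustration, that the constant hidden in a(n) is 1 and that a(n) is rounded down to an integer; `ilog` and `smallest_t` are hypothetical helper names.

```python
def ilog(n, b):
    """Largest integer k with b**k <= n (integer logarithm, no floats)."""
    k = 0
    while b ** (k + 1) <= n:
        k += 1
    return k

def smallest_t(s, a):
    """Smallest t that splits into s segments of sizes a(1), ..., a(s)."""
    return sum(a(k) for k in range(1, s + 1))

s, n, b, v = 4, 1024, 2, 2
a_n = max(1, ilog(n, b) // v)          # assumed a(n) = floor(log_b n / v)
print(smallest_t(s, lambda k: a_n))    # 20, i.e. s * a(n)
```

With equal segment sizes the result reduces to s * a(n), which is the form used in the examples above.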

Remarks 

The lower bound proof works independently of the specific
restrictions on the form of the admissible computations, as long
as they exclude the case in which the depth of a variable in a
single round can exceed n - 1.


3.   THE  MODEL  WHERE  TRUE  CONCURRENCY  IS  ALLOWED 

3.1   The  Model 

Again, a concurrent system is a collection P of processes
and X of shared variables. The global state consists of the
internal state of each process together with the value of each
shared variable. A multistep $\sigma$ is an atomic action consisting
of simultaneous changes to the states of some processes and the
values of some shared variables, i.e. a list of pairs

$((S_1,V_1),(S_2,V_2),\ldots,(S_\ell,V_\ell))$

where $S_i = (s_i,p_i,t_i)$, with $s_i, t_i$ the states of process $p_i$ (in
some arbitrary enumeration of processes) before and after the
execution of the multistep $\sigma$, and $V_i = (u,x,v)$, where u, v are the
values of the variable x before and after the execution of the
multistep $\sigma$. Let us indicate x by var($V_i$) and $p_i$ by proc($S_i$).
For every multistep $\sigma$ the following restriction must hold:

(R1) There are no i, j with $i \ne j$ such that proc($S_i$) = proc($S_j$)
or var($V_i$) = var($V_j$).
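
Restriction (R1) says that no process and no variable occurs twice in a multistep. A minimal check, assuming a multistep is stored as a list of ((s, p, t), (u, x, v)) tuples and using the hypothetical name `satisfies_r1`, could look like this:

```python
def satisfies_r1(multistep):
    """True if all processes and all variables in the multistep are distinct."""
    procs = [S[1] for S, _ in multistep]       # proc(S_i)
    variables = [V[1] for _, V in multistep]   # var(V_i)
    return (len(set(procs)) == len(procs)
            and len(set(variables)) == len(variables))

ok = [(("a", "p1", "b"), (0, "x", 1)), (("a", "p2", "b"), (0, "y", 1))]
bad = [(("a", "p1", "b"), (0, "x", 1)), (("a", "p2", "b"), (1, "x", 2))]
print(satisfies_r1(ok))   # True
print(satisfies_r1(bad))  # False: variable x is accessed twice
```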

Such a concurrent system has a speed-bound of v if every infinite
computation (sequence of multisteps) can be partitioned into
consecutive nonoverlapping segments, of v multisteps each, which
have the following properties:

(1) For every process p of the system there is at least
one multistep $\sigma$ in each segment such that p = proc($S_i$)
for some $(S_i,V_i)$ of the multistep $\sigma$.

(2) For every segment there is at least one process which
participates in every multistep of the segment.
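
Properties (1) and (2) can be checked on a finite prefix of a computation. The sketch assumes each multistep is reduced to the set of processes taking part in it and that only complete segments are examined; the function name is illustrative.

```python
def respects_speed_bound(multisteps, processes, v):
    """Check properties (1) and (2) on every complete segment of v multisteps."""
    processes = set(processes)
    complete = len(multisteps) - len(multisteps) % v
    for start in range(0, complete, v):
        segment = [set(m) for m in multisteps[start:start + v]]
        if not processes <= set().union(*segment):   # (1) everyone appears
            return False
        if not set.intersection(*segment):           # (2) someone is in all
            return False
    return True

procs = {"p1", "p2", "p3"}
good = [{"p1", "p2"}, {"p1", "p3"}, {"p1", "p2", "p3"}, {"p1"}]
print(respects_speed_bound(good, procs, 2))                    # True
print(respects_speed_bound([{"p1", "p2"}, {"p3"}], procs, 2))  # False
```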

It may seem that concurrent systems with speed-bound v (as above)
and concurrent systems of relative speed v (as in 2.1) are very
similar. However, the next section reveals a fundamental
difference in the two models:

3.2   An  Algorithm  Which  Solves  the  [s,n]-Session  Problem
in  O(s)  Time

Let a minimal round be a sequence of multisteps in which
each process of the system appears in at least one multistep of
the sequence and the sequence is minimal with this property.

The following simple algorithm solves the [s,n]-session
problem in O(s) time in speed-bound systems with speed v: each
process accesses its port variable for vs of its steps and then
stops. To prove that the above solves the [s,n]-session problem,
it is enough to notice that v steps of any process include at
least one minimal round, which includes one session because all
port variables are accessed. Also, v steps of any process are not
more than v minimal rounds. So, the total time to quiescence is
vs = O(s). However, in the model in which concurrency was
represented by nondeterminism in step sequences of individual
processes, every system of relative speed v takes at least time
$\Omega(s \log_b n / v)$ to solve the [s,n]-session problem, for at least one
allowable computation.
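
The algorithm can be tried out on one particular speed-bound schedule. The sketch assumes that process 0 takes part in every multistep while all other processes take part only in the last multistep of each segment; this schedule, and the name `simulate_port_algorithm`, are illustrative choices rather than part of the original text.

```python
def simulate_port_algorithm(n, s, v):
    """Each of n processes accesses its own port v*s times, then stops.
    Returns (sessions performed, segments of v multisteps until quiescence)."""
    remaining = [v * s] * n        # port accesses left for each process
    sessions, segments, seen = 0, 0, set()
    while any(remaining):
        segments += 1
        for k in range(v):         # one segment of v multisteps
            # process 0 is in every multistep, the others only in the last
            participants = range(n) if k == v - 1 else [0]
            for p in participants:
                if remaining[p]:
                    remaining[p] -= 1
                    seen.add(p)    # port-event on the port of process p
            if len(seen) == n:     # every port accessed: a session ends
                sessions += 1
                seen = set()
    return sessions, segments

print(simulate_port_algorithm(n=3, s=2, v=2))  # (2, 4)
```

On this schedule every segment is one minimal round, so quiescence is reached after v*s rounds with exactly s sessions performed.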

Corollary  3.2 

The  two  models  of  concurrency  (nondeterminism  and  "true" 
parallelism)  are  not  equivalent. 